Synthesizing Audio for Hindi WordNet

Authors

  • Diptesh Kanojia
  • Preethi Jyothi
  • Pushpak Bhattacharyya
Abstract

In this paper, we describe our work on creating a voice model for the Hindi language using a speech synthesis system. We use preexisting “voices” and also use publicly available speech corpora to create a “voice” with the Festival Speech Synthesis System (Black, 1997). Our contribution is two-fold: (1) we scrutinize multiple speech synthesis systems, provide an extensive report on the currently available state-of-the-art systems, and develop voices using the existing implementations of these systems; and (2) we use these voices to generate sample audio for randomly chosen words, manually evaluate the generated audio, and produce audio for all WordNet words using the winning voice model. We also produce audio for the Hindi WordNet glosses and example sentences. We describe our efforts to use preexisting implementations of WaveNet, a model that generates raw audio using neural nets (Oord et al., 2016), to generate speech for Hindi. Our lexicographers perform a manual evaluation of the audio generated using multiple voices. A qualitative and quantitative analysis reveals that the voice model we created performs best, with an accuracy of 0.44.

Similar resources

Creation of Lexical Relations for IndoWordNet

WordNet is an electronic lexical database available online as a powerful resource for researchers in computational linguistics, text processing and other related areas. A WordNet for the Hindi language has already been developed by IIT Bombay. The Indian-language WordNets are being created using the expansion approach from Hindi WordNet under the IndoWordNet project. In expansion approach, ...

Introduction to Gujarati Wordnet

Gujarati language is the youngest member of IndoWordnet[1]. As a part of IndoWordnet project, Wordnet for Gujarati language is being developed from Hindi Wordnet using expansion approach. This paper reviews the Gujarati Wordnet development process. It describes the basic features of Gujarati language and evaluates suitability of Hindi language as a source language. Also, the current status of t...

Hindi Subjective Lexicon : A Lexical Resource for Hindi Polarity Classification

With recent developments in web technologies, the percentage of web content in Hindi is growing at lightning speed. This information can prove very useful for researchers, governments and organizations to learn what is on the public's mind and to make sound decisions. In this paper, we present a graph based wordnet expansion method to generate a full (adjective and adverb) subjective lexicon. We used s...

An Insight into Role of Wordnet and Language Network for effective IR from Hindi Text Documents

This paper investigates the limitations of traditional Information Retrieval (IR) models and how semantic-based approaches overcome these limitations. Further, the paper analyzes a range of aspects of the language network representation of a text corpus and how different network properties can lead to improved results for different applications of IR. The paper analyzes Hindi Wordnet to exploi...

Merging Verb Senses of Hindi WordNet using Word Embeddings

In this paper, we present an approach for merging fine-grained verb senses of Hindi WordNet. Senses are merged based on gloss similarity score. We explore the use of word embeddings for gloss similarity computation and compare with various WordNet based gloss similarity measures. Our results indicate that word embeddings show significant improvement over WordNet based measures. Consequently, we...

Publication date: 2017